Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(io-engine): add persistent store transaction API #1791

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

dsharma-dc
Copy link
Contributor

This change adds API to do Compare-and-Swap operations to the persistent store via io-engine.

@dsharma-dc
Copy link
Contributor Author

bors try

bors-openebs-mayastor bot pushed a commit that referenced this pull request Dec 20, 2024
@bors-openebs-mayastor
Copy link

try

Build succeeded:

self.0
.txn(Txn::new().when(cmps).and_then(ops_success).or_else(fops))
.await
.context(TxnErr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we get an error on the logs which is clear as to what happened?

Copy link
Contributor Author

@dsharma-dc dsharma-dc Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we get an error on the logs which is clear as to what happened?

It's difficult to get complete info as to what failed as part of transaction. The succeeded() validation in transaction response is deterministic info of whether transaction compares succeed or fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what does it look like on the logs with this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean from io-engine logging? or api usage. For io-engine it'll be something like below, this is not final yet:

[2025-01-08T12:05:37.051454598+00:00  INFO io_engine::persistent_store:persistent_store.rs:208] Executing transaction for key /openebs.io/mayastor/apis/v0/clusters/761ab476f4f4489b900ded45a464a79d/namespaces/default/volume/1c4e66fd-3b33-4439-b504-d49aba53da26/nexus/2f2455c8-e237-4586-848d-762f26ebb044/info.
[2025-01-08T12:05:37.052938551+00:00 ERROR io_engine::bdev::nexus::nexus_persistence:nexus_persistence.rs:216] Nexus '1c4e66fd-3b33-4439-b504-d49aba53da26' [open]: failed to update persistent store, will shutdown the nexus: failed to save nexus state 1c4e66fd-3b33-4439-b504-d49aba53da26

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have some more context here on the failure.
Also I hope we distinguish compare errors from general etcd write failures which should be retried?

Copy link
Contributor Author

@dsharma-dc dsharma-dc Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have some more context here on the failure. Also I hope we distinguish compare errors from general etcd write failures which should be retried?

Yes the general error if received as Err from txn API has to be dealt with retries by caller. However if a Put TxnOp failure happens after a successful compare, then the etcd handles it harshly on server side:

	_, err := executeTxn(ctx, lg, txnWrite, rt, txnPath, txnResp)
	if err != nil {
		if isWrite {
			// CAUTION: When a txn performing write operations starts, we always expect it to be successful.
			// If a write failure is seen we SHOULD NOT try to recover the server, but crash with a panic to make the failure explicit.
			// Trying to silently recover (e.g by ignoring the failed txn or calling txn.End() early) poses serious risks:
			// - violation of transaction atomicity if some write operations have been partially executed
			// - data inconsistency across different etcd members if they applied the txn asymmetrically
			lg.Panic("unexpected error during txn with writes", zap.Error(err))
		} else {
			lg.Error("unexpected error during readonly txn", zap.Error(err))
		}
	}

Copy link
Contributor Author

@dsharma-dc dsharma-dc Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have some more context here on the failure.

Have extended caller side error handling(not in this PR) by printing source in StateSaveFailed error to add more context in error, as below.

[2025-01-09T11:29:00.330005232+00:00  INFO io_engine::persistent_store:persistent_store.rs:208] Executing transaction for key /openebs.io/mayastor/apis/v0/clusters/761ab476f4f4489b900ded45a464a79d/namespaces/default/volume/1c4e66fd-3b33-4439-b504-d49aba53da26/nexus/486a4605-1004-409b-b24c-9786c8ae279d/info.
[2025-01-09T11:29:00.331331260+00:00 ERROR io_engine::bdev::nexus::nexus_persistence:nexus_persistence.rs:216] Nexus '1c4e66fd-3b33-4439-b504-d49aba53da26' [open]: failed to update persistent store, will shutdown the nexus: failed to save nexus state 1c4e66fd-3b33-4439-b504-d49aba53da26, Failed to do 'transaction' for key /openebs.io/mayastor/apis/v0/clusters/761ab476f4f4489b900ded45a464a79d/namespaces/default/volume/1c4e66fd-3b33-4439-b504-d49aba53da26/nexus/486a4605-1004-409b-b24c-9786c8ae279d/info. Error io error: Txn CompareOp failed

io-engine/src/store/etcd.rs Show resolved Hide resolved
io-engine/src/store/etcd.rs Show resolved Hide resolved
This change adds API to do Compare-and-Swap operations
to the persistent store via io-engine.

Signed-off-by: Diwakar Sharma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants